


(19) 




Europäisches Patentamt
European Patent Office
Office européen des brevets



(12) 



(43) Date of publication: 

26.08.1998 Bulletin 1998/35 

(21) Application number: 98301199.0 

(22) Date of filing: 13.02.1998 



(11) EP 0 860 786 A2

EUROPEAN PATENT APPLICATION 

(51) Int. Cl.6: G06F 17/30



(84) Designated Contracting States:
AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
Designated Extension States:
AL LT LV MK RO SI

(30) Priority: 24.02.1997 US 804599

(71) Applicant: International Business Machines Corporation
Armonk, N.Y. 10504 (US)

(72) Inventors:
• Brown, Eric William
New Fairfield, Connecticut 06812 (US)
• Chang, Rong Nickle
No. 11 Ossining, New York 10562 (US)
• Ellozy, Hamed Abdelfattah
Bedford Hills, New York 10507 (US)
• Prager, John Martin
Hackensack, New Jersey 07601 (US)
• So, Edward Choi chin
Flushing, New York 11358 (US)

(74) Representative: Zerbi, Guido Maria
Intellectual Property Department,
IBM United Kingdom Ltd.,
Hursley Park
Winchester, Hampshire SO21 2JN (GB)





(54) System and method for hierarchically grouping and ranking a set of objects in a query 
context 



(57) Topically relevant objects in an object database
are first identified using any generally known methods
to obtain a set of topically relevant objects (topically
relevant set). Parents, and in alternative embodiments
other ancestors, of one or more of the topically relevant
objects are identified according to directional structural
relationships that the parents have with respect to the
topically relevant objects. These objects form a set of
structurally relevant objects (structurally relevant set). In
some embodiments, the user query identifies one or
more of these structural relationships. The topically
relevant objects are then organized under one or more of
their respective parents to form a hierarchy level of both
(topically relevant and structurally relevant) sets of
objects. In some preferred embodiments, the process can
iterate to create more than one hierarchy level.



Printed by Jouve. 75001 PARIS (FR) 






Description 

Field of the Invention 

This invention relates to the field of searching and 
navigating a large object collection, particularly a hyper- 
media database in a networking environment. More 
specifically, the invention relates to a system and meth- 
od for generating grouped hierarchical views (with rank- 
ing) for a set of (hypermedia) objects in a query context 
based on one or more relationships. 

Background of the Invention 

A hypermedia object database is a collection of
hypermedia objects stored electronically as files on one or
more computers. Hypermedia objects contain information
in the form of text, images, sound, or video.
Hypermedia objects may also participate in relationships,
where a relationship identifies one or more hypermedia
objects that are related somehow (a hypermedia object
may be related to itself). One common relationship is
the hyperlink relationship. Two hypermedia objects are
related by a (directed) hyperlink relationship if one of the
objects contains a hyperlink pointer to the other object.
A hyperlink pointer is a reference to a hypermedia object
that allows the target object to be accessed directly from
the source object.

Users access a hypermedia object database to lo- 
cate objects of interest and retrieve those objects for 
processing (e.g., reading, viewing, listening, analysis). 
Finding objects of interest by manually inspecting every 
object in a large database is impractical. Instead, users 
typically search the database for interesting objects us- 
ing a search system. A search system allows a user to 
express an information need in the form of a query. The 
system's search engine processes the query and re- 
turns to the user a hit-list of relevant objects. The user 
then selects interesting objects from the hit-list and re- 
trieves those objects. 

A relational database management system (RD- 
BMS) may be used to index and search arbitrary hyper- 
media objects based on their attributes. Attributes in- 
clude items such as size, creation date, author, and title. 
Searching for objects in this fashion is well known. In
addition to attribute-based searching, users may want 
to search for hypermedia objects based on their content. 
The algorithms and data structures used by a content- 
based search system depend on the kind of object being 
searched. Text objects are typically searched using an 
information retrieval (IR) system (e.g., IBM SearchMan- 
ager/2, a trademark of the IBM Corporation). Image ob- 
jects are typically searched using an image indexing and 
retrieval system (e.g., IBM QBIC, a trademark of the IBM 
Corporation). Content-based search techniques for vid- 
eo and sound exist and have been incorporated into pro- 
totype systems, but this technology is less mature than 
text and image search. Objects found using an attribute- 



based or content-based search system are said to be 
"topically relevant" to the query.

Some prior art content-based search systems attempt
to improve the search results for hypermedia object
databases by refining object relevance scores
based on the structural relationships (e.g., hyperlinks)
between the objects. Three representative techniques
are used by these systems. The first technique is a form
of "spreading activation," where object relevance scores
are propagated along outbound hyperlink pointers to
neighbouring objects and used to modify the relevance
scores of those objects (see Cohen, P. R., and Kjeldsen,
R. "Information Retrieval by Constrained Spreading
Activation in Semantic Networks," Information Processing
& Management, 23(2), pp. 255-266, 1987; Savoy, J.
"Citation Schemes in Hypertext Information Retrieval," in
M. Agosti and A. Smeaton (Eds.), Information Retrieval
and Hypertext, Boston, Kluwer Academic Publishers,
pp. 99-120, 1996). This procedure is typically iterated
until a steady-state is reached or some terminating
condition is met. The objects are then sorted by their final
relevance scores and returned on a flat hit-list (i.e., the
hit-list simply enumerates the objects without describing
any structural relationships).
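
As an illustration only, the spreading-activation idea can be sketched in Python as follows. The function name `spread_activation`, the `decay` factor, the iteration limit, and the stopping tolerance are assumptions made for this sketch; they are not taken from the cited references, which describe more elaborate (constrained) schemes.

```python
def spread_activation(scores, links, decay=0.5, max_iters=20, tol=1e-6):
    """Propagate relevance along outbound links until scores stop changing.

    scores: dict object -> initial (content-based) relevance score
    links:  dict source object -> list of target objects (outbound hyperlinks)
    decay:  assumed damping factor applied to propagated activation
    """
    current = dict(scores)
    for _ in range(max_iters):
        # Each round starts from the initial scores and adds the
        # activation received from the previous round's neighbours.
        updated = dict(scores)
        for src, targets in links.items():
            for dst in targets:
                updated[dst] = updated.get(dst, 0.0) + decay * current.get(src, 0.0)
        change = max(
            (abs(v - current.get(k, 0.0)) for k, v in updated.items()),
            default=0.0,
        )
        current = updated
        if change < tol:  # steady state reached
            break
    # Flat hit-list: objects sorted by final score, no structure reported.
    return sorted(current.items(), key=lambda kv: kv[1], reverse=True)


initial = {"A": 1.0, "B": 0.0, "C": 0.0}
links = {"A": ["B"], "B": ["C"]}
hit_list = spread_activation(initial, links)
```

In this toy collection only A matches the query, yet B and C receive non-zero scores because they are reachable from A. The result is still a flat list, which is the limitation the passage above points out.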

In the second technique, it is assumed that the
hypermedia objects are organized in a given hierarchy,
such that every object has at most one parent and the
children of a given object are explicitly identified. An
object's relevance score is then calculated as a function of
its content-based relevance score and the relevance
scores of its children. Relevance scores must be
propagated from the leaves of a hierarchy to the root (see
Frisse, M. E. "Searching for Information in a Hypertext
Medical Handbook," Communications of the ACM, 31
(7), pp. 880-886, 1988). The objects are then sorted by
their final relevance scores and returned on a flat hit-list.
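
A minimal sketch of this leaf-to-root propagation is given below. The combining function (own score plus a fixed fraction of each child's final score) and the names `propagate_up` and `child_weight` are assumptions chosen for illustration; the cited work defines its own function.

```python
def propagate_up(children, content_score, root, child_weight=0.5):
    """Compute final scores bottom-up in a tree.

    children:      dict node -> list of child nodes (each node has one parent)
    content_score: dict node -> content-based relevance score
    child_weight:  assumed fraction of a child's final score passed to its parent
    """
    final = {}

    def visit(node):
        total = content_score.get(node, 0.0)
        for child in children.get(node, []):
            total += child_weight * visit(child)  # children are scored first
        final[node] = total
        return total

    visit(root)
    return final


tree = {"book": ["ch1", "ch2"], "ch1": ["s1"]}
content = {"book": 0.0, "ch1": 0.25, "ch2": 0.5, "s1": 1.0}
final_scores = propagate_up(tree, content, "book")
flat_hit_list = sorted(final_scores, key=final_scores.get, reverse=True)
```

Here the root "book" has no matching content of its own but still outranks "ch2" because of the scores inherited from its descendants.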

In the third technique, the content of neighbouring 
objects is added to the content of the current object 
when determining the relevance score for the current 
object (see Croft et al. "Retrieving Documents by Plau- 
sible Inference: an Experimental Study," Information 
Processing & Management, 25(6), pp. 599-614, 1989).
Neighbouring objects are those objects to which the cur- 
rent object contains hyperlink pointers. As in the previ- 
ous two techniques, objects are sorted by their rele- 
vance scores and returned on a flat hit-list. 
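
The third technique can be sketched as follows. Simple term counting stands in for a real relevance function here, and the names `augmented_terms` and `score` are hypothetical; the cited study uses a probabilistic inference model rather than raw counts.

```python
from collections import Counter


def augmented_terms(obj_id, content, links):
    """Terms of an object plus the terms of the objects it links to."""
    terms = Counter(content[obj_id].lower().split())
    for neighbour in links.get(obj_id, []):
        terms.update(content[neighbour].lower().split())
    return terms


def score(query, obj_id, content, links):
    """Assumed scoring: total occurrences of query terms in the augmented content."""
    terms = augmented_terms(obj_id, content, links)
    return sum(terms[t] for t in query.lower().split())


content = {
    "p1": "patent search",
    "p2": "hypermedia search engine",
    "p3": "cooking",
}
links = {"p1": ["p2"]}  # p1 contains a hyperlink pointer to p2
scores = {obj: score("hypermedia search", obj, content, links) for obj in content}
```

Object p1 does not contain the word "hypermedia" itself, but it scores highest because the content of its neighbour p2 is counted as part of its own.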

Regardless of the search technology being used, 
most search systems follow the same basic procedure 
for indexing and searching a hypermedia object data- 
base. First, the objects to be searched must be input to 
the search system for indexing. Next, attributes and/or 
contents are extracted from the objects and processed 
to create an index. An index consists of data that is used 
by the search system to process queries and identify 
relevant objects. After the index is built, queries may be 
submitted to the search system. The query represents 
the user's information need and is expressed using a 
query language and syntax defined by the search system.
The search system processes the query using the
index data for the database and a suitable similarity
ranking algorithm, and returns a hit-list of topically
relevant objects. The user may then select relevant objects
from the hit-list for viewing and processing.

A user may also use objects on the hit-list as
navigational starting points. Navigation is the process of
moving from one hypermedia object to another
hypermedia object by traversing a hyperlink pointer between
the objects. This operation is typically facilitated by a
user interface that displays hypermedia objects,
highlights the hyperlinks in those objects, and provides a
simple mechanism for traversing a hyperlink and
displaying the referent object. One such user interface is a
Web browser (see below). By navigating, a user may
find other objects of interest.

In a networking environment, the components of a
hypermedia object database system may be spread 
across multiple computers. A computer comprises a 
Central Processing Unit (CPU), main memory, disk stor- 
age, and software (e.g., a personal computer (PC) like 
the IBM ThinkPad). (ThinkPad is a trademark of the IBM 
Corporation.) A networking environment consists of two 
or more computers connected by a local or wide area 
network (e.g., Ethernet, Token Ring, the telephone
network, and the Internet). (See for example, U.S. Patent
Number 5,371,852 to Attanasio et al. issued on
December 6, 1994 which is herein incorporated by reference
in its entirety.) A user accesses the hypermedia object 
database using a client application on the user's com- 
puter. The client application communicates with a 
search server (the hypermedia object database search 
system) on either the user's computer (e.g. a client) or 
another computer (e.g. one or more servers) on the net- 
work. To process queries, the search server needs to 
access just the database index, which may be located 
on the same computer as the search server or yet an- 
other computer on the network. The actual objects in the 
database may be located on any computer on the net- 
work. 

A Web environment, such as the World Wide Web 
on the Internet, is a networking environment where Web 
servers, e.g. Netscape Enterprise Server and IBM Inter- 
net Connection Server, and browsers, e.g. Netscape 
Navigator and IBM WebExplorer, are used. (Netscape 
Navigator is a trademark of the Netscape Communica- 
tions Corporation and WebExplorer is a trademark of the 
IBM Corporation.) Users can make hypermedia objects 
publicly available in a Web environment by registering 
the objects with a Web server. Moreover, users can cre- 
ate arbitrary relationships between these objects, even 
if the objects were created by another user. Other users 
in the Web environment can then retrieve these objects 
using a Web browser. The collection of objects retriev- 
able in a Web networking environment can be consid- 
ered as a large hypermedia object database. 

To create an index for a hypermedia object database
in a Web networking environment, the prior art
often uses Web crawlers, also called robots, spiders,
wanderers, or worms (e.g., WebCrawler, WWWWorm), to
gather the available objects and submit them to the
search system indexer. Web crawlers make use of the
(physical) hyperlinks stored in objects. All of the objects
are gathered by identifying a few key starting points,
retrieving those objects for indexing, retrieving and
indexing all objects referenced by the objects just indexed (via
hyperlinks), and continuing recursively until all objects
reachable from the starting points have been retrieved
and indexed. The graph of objects in a Web environment
is typically well connected, such that nearly all of the
available objects can be found when appropriate
starting points are chosen.
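
The gathering procedure described above amounts to a graph traversal from the starting points. A minimal sketch follows, using an in-memory link table in place of real network retrieval; `crawl`, `fetch_links`, and the sample `web` table are names invented for this illustration, and a breadth-first order is an arbitrary choice.

```python
from collections import deque


def crawl(start_points, fetch_links):
    """Visit every object reachable from the starting points exactly once.

    fetch_links: callable returning the hyperlink targets of an object
                 (stands in for retrieving and parsing the object).
    Returns the objects in the order they would be handed to the indexer.
    """
    start = list(dict.fromkeys(start_points))  # drop duplicate starting points
    seen = set(start)
    queue = deque(start)
    order = []
    while queue:
        obj = queue.popleft()
        order.append(obj)  # the object would be submitted for indexing here
        for target in fetch_links(obj):
            if target not in seen:  # avoid re-fetching and endless cycles
                seen.add(target)
                queue.append(target)
    return order


web = {"A": ["B", "C"], "B": ["C", "D"], "C": ["A"], "D": [], "E": ["A"]}
indexed = crawl(["A"], lambda obj: web.get(obj, []))
```

Object E links to A but nothing links to E, so a crawl started at A never finds it. This is why the choice of starting points matters when the graph is not fully connected.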

Having gathered and indexed all of the objects 
available in the Web environment, the index can then 
be used, as described above, to search for objects in 
the Web. Again, the index may be located independently 
of the objects, the client, and even the search server. A 
hit-list, generated as the result of searching the index, 
will typically identify the locations of the relevant objects 
on the Web, and the user will retrieve those objects di- 
rectly with their Web browser. 



When searching a hypermedia object database, the 
prior art fails to create a hierarchically structured search
result based on arbitrary relationships between the ob- 
jects in the database. Some prior art systems consider 
relationships when ranking objects, but the final result 
in most of these systems is still a simple hit-list (compa- 
rable to systems that ignore relationships altogether) 
that conveys no information about the relationships be- 
tween the objects. Those prior art systems that can dis- 
play a hierarchically structured search result require that 
a single, pre-determined hierarchy be specified for all of 
the objects in the hypermedia database. For large da- 
tabases (e.g., the World Wide Web), this requirement is 
impossible to satisfy, and in general, this requirement is 
restrictive and inflexible. 

No prior art system fully exploits the relationships 
between objects in a hypermedia object database to 1)
produce a search result that includes both topically and 
structurally relevant objects ("structurally relevant" ob- 
jects are objects that may not be topically relevant, but 
are still relevant to the query due to their structural re- 
lationships with topically relevant objects), and 2) 
present the search results such that the end user can 
easily identify and exploit the relationships between the 
objects on the hit-list. 

The prior art performs even worse in a networked 
multi-user environment, e.g. the World Wide Web, 
where documents are created by multiple authors and 
are poorly cross referenced. For example, when a hy- 
pertext document is created by a single author, its pages 
are typically well cross referenced and provide links to 
next, previous, parent, index, and table of contents pages.
Even though a prior art search system ignores these
links and identifies only topically relevant pages in the 
document, since the document is well cross referenced, 
the user can navigate to structurally relevant pages in 
the document by following the cross reference links. 
However, the user does not have this option in a net- 
worked multi-user environment where a second, inde- 
pendent author could create a related document with hy- 
pertext links to the document created by the first author. 
A relationship now exists between these two docu- 
ments, but there are no good cross references from the 
first document to the second document, preventing the 
user from navigating to the second document. If the
second document is not topically relevant to the user's
query, the user will fail to find the second document even
though it is structurally relevant to the query due to its 
relationship to the first document. 

It is an object of the present invention to provide a 
technique which alleviates the above drawbacks. 

According to the present invention we provide a 
computer system having one or more memories and 
one or more central processing units, the computer
capable of accessing one or more database memories, a
plurality of objects stored in one or more of the database
memories and each object identified by an index, the 
computer system further comprising: a user interface for 
entering a query; a hit list, stored in one or more of the 
memories, the hit list being a topically relevant set of a 
plurality of topically relevant objects, the topically rele- 
vant set being selected from one or more of the data- 
base memories and ranked by a topical search engine 
using the index, the rank indicating a topical relevance 
to the query; a relationship data structure containing in- 
formation about how two or more of the objects are re- 
lated to one another by one or more structural relation- 
ships, the structural relationships being directed; and a 
hierarchical view generator that selects a structurally 
relevant set of the objects, each object in the structurally 
relevant set being a parent of one or more of the topically 
relevant objects according to one or more of the struc- 
tural relationships, each of the topically relevant objects 
being a child of one or more of the parents, the hierar- 
chical view generator organizing each of the child ob- 
jects under one or more of its respective parents to cre- 
ate a directional hierarchy. 

Further according to the present invention we pro- 
vide a method for hierarchically grouping a plurality of 
objects stored in one or more database memories of a 
computer system, comprising the steps of: a) selecting 
a topically relevant set of two or more of the objects; b) 
ranking the objects in the topically relevant set; c) iden- 
tifying one or more structural relationships between one 
or more of the objects in the topically relevant set, being 
children, and one or more of the objects in the database 
memories, being parents, the structural relationships 
being directional from the parent to the child; and d) or- 
ganizing one or more of the children under each of the 
respective parents. 
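
Steps a) to d) of the method can be sketched as follows. This is an illustrative reading, not the claimed implementation: the function name `group_under_parents` is hypothetical, the hit-list is assumed to be already selected and ranked (steps a and b), and the sample links are invented for the example rather than taken from the figures.

```python
def group_under_parents(hit_list, relationship):
    """Organize ranked, topically relevant objects under their parents.

    hit_list:     list of (object, score) pairs in rank order (steps a and b)
    relationship: set of (parent, child) pairs, directed parent -> child
    Returns dict parent -> list of children, children kept in rank order.
    """
    relevant = {obj for obj, _score in hit_list}

    # Step c: find parents of the topically relevant objects.
    parents_of = {}
    for parent, child in relationship:
        if child in relevant:
            parents_of.setdefault(child, []).append(parent)

    # Step d: place each child under every one of its parents.
    hierarchy = {}
    for obj, _score in hit_list:
        for parent in sorted(parents_of.get(obj, [])):
            hierarchy.setdefault(parent, []).append(obj)
    return hierarchy


# Hypothetical hyperlinks; only the object names echo the later example.
sample_links = {("A", "C"), ("A", "D"), ("A", "E"),
                ("H", "E"), ("H", "F"), ("H", "G"), ("B", "J")}
sample_hits = [("E", 0.9), ("C", 0.8), ("G", 0.6), ("D", 0.5), ("F", 0.4)]
hierarchy = group_under_parents(sample_hits, sample_links)
```

Object E appears under both A and H because it is a child of both, and B is omitted because none of its children is topically relevant.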



Summary of the Invention

Topically relevant objects are first identified using
any generally known methods to obtain a set of topically
relevant objects (topically relevant set). Parents, and in
alternative embodiments other ancestors, of one or
more of the topically relevant objects are identified
according to directional structural relationships that the
parents have with respect to the topically relevant
objects. These objects form a set of structurally relevant
objects (structurally relevant set). In some embodiments,
the user query identifies one or more of these
structural relationships. The topically relevant objects
are then organized under one or more of their respective
parents to form a hierarchy level of both (topically
relevant and structurally relevant) sets of objects. In some
preferred embodiments, the process can iterate to
create more than one hierarchy level.

In alternative embodiments, subsets of the topically
relevant set can be selected, e.g., by rank, as the
topically relevant set. Also, subsets of the structurally
relevant set, e.g., by weighting, can be selected as the
structurally relevant set. For example, the weight can be
based on the number of hyperlink paths that connect the
ancestor to one or more of the topically relevant objects,
the strengths of these hyperlink paths, and other
attributes of the ancestor, such as the total number of
hyperlinks originating at the ancestor and the topical
relevance of the ancestor.

Brief Description of the Drawings 

The foregoing and other objects, aspects and
advantages will be better understood from the following
detailed description of preferred embodiments of the
invention with reference to the drawings that include the
following:

Figure 1 is a block diagram of the computing
environment in which the present invention is used in a
non-limiting preferred embodiment.

Figure 2, comprising Figures 2A and 2B, is a block
diagram of an example object collection, in
particular a hypermedia object database.

Figure 3 is a block diagram of an example
hypermedia object database that is hierarchically
grouped by the present invention.

Figure 4 is a block diagram of an example
hypermedia object database that is hierarchically
grouped as defined by a first structural relationship.

Figure 5 is a block diagram of an example
hypermedia object database that is hierarchically
grouped as defined by a second structural
relationship.

Figure 6 comprises prior art Figures 6A and 6B,
where Figure 6A shows an object catalog, and
Figure 6B a table of named attribute values for objects
in the catalog.

Figure 7 comprises Figures 7A and 7B, where
Figure 7A shows a structured query and Figure 7B
shows an object hit-list.

Figure 8 shows a relationship catalog and several
of the relationship tables to which the relationship
catalog refers.

Figure 9 shows a children table.

Figure 10 is a flow chart showing the steps of one
preferred hierarchical view generator process
executed by the present invention.

Figure 11 is a flow chart showing the steps of a
preferred process for scoring parent objects based on
structural relationships.

Figure 12, comprising Figures 12A and 12B, are
flow charts showing the steps of a preferred process
for displaying the results in a ranked hierarchical
order.

Figure 13 is a screen dump showing a sample
output result of the present invention.

Detailed Description of the Invention 

Figure 1 is a block diagram of the computing
environment in which the present invention is used in a
non-limiting preferred embodiment. The figure shows some
of the possible hardware, software, and networking
configurations that make up the computing environment.

The computing environment or system 100
comprises one or more general purpose computers 170,
175, 180, 185, 190, and 195 interconnected by a
network 105. Examples of general purpose computers
include the IBM Aptiva personal computer, the IBM RISC
System/6000 workstation, and the IBM POWERparallel
SP2. (These are Trademarks of the IBM Corporation.)
The network 105 may be a local area network (LAN), a
wide area network (WAN), or the Internet. Moreover, the
computers in this environment may support the Web
information exchange protocol (HTTP) and be part of a
local Web or the World Wide Web (WWW). Some
computers (e.g., 195) may occasionally or always be
disconnected 196 from the network and operate as
stand-alone computers.

Hypermedia objects 140 are items such as books,
articles, reports, pictures, movies, or recordings that
contain text, images, video, audio, or any other
multimedia object and/or information. One or more
hypermedia objects are stored on one or more computers in the
environment.

To find a particular hypermedia object in the
environment, a query (see Figure 7A) is submitted for
processing to a topical search engine 120 running on a
computer in the environment. The topical search engine
uses an index 130 to identify hypermedia objects that
are relevant to the query. The topical search engine
creates an index by indexing a particular set of hypermedia
objects in the environment, called a hypermedia object
database 141. A hypermedia object database 141 may
comprise hypermedia objects located anywhere in the
computing environment, e.g., spread across two or
more computers. The relevant hypermedia objects
identified by the index are ranked and returned by the topical
search engine in the form of a hit-list (see Figure 7B).
The process is well known in the prior art. Examples of
topical search engines include SearchManager/2 (a
trademark of the IBM Corporation).

The result of the search is further processed by sub- 
mitting the query and topical search engine results to a 
novel hierarchical view generator 110. The hierarchical 
view generator 110 uses structural relationships in the 
index 130 (see Figure 8) to analyze (parent selection 
and ranking) the relationships in the database and im- 
prove the search result. The structural relationships are 
stored in the index by the hierarchical view generator at 
indexing time. In some preferred embodiments, the hi- 
erarchical view generator selects structurally relevant 
objects by calculating parent ranks based on the ranks 
of the topically relevant objects (generated by the topical 
search engine) and/or the (weighted) structural relation- 
ships in which the topically relevant objects participate. 
Structurally relevant objects have a structural relation- 
ship (see Figure 8) with one or more of the topically rel- 
evant objects. (Structurally relevant objects may or may 
not be topically relevant.) The hierarchical view gener- 
ator then aggregates topically relevant objects based on 
their relationships and generates ranked hierarchies of 
both the topically relevant objects and structurally
relevant objects to present to the user. For convenience, the
topical search engine 120 and hierarchical view
generator 110 are shown here as separate components. Note,
however, that both systems may be internal compo- 
nents of a general hypermedia object search system. 

Hypermedia objects 140 and/or indexes 130 on one
computer may be accessed over the network by another
computer using the Web (HTTP) protocol, a networked file
system protocol (e.g., NFS, AFS), or some other
protocol. Services on one computer (e.g., topical search
engine 120) may be invoked over the network by another
computer using the Web protocol, a remote procedure
call (RPC) protocol, or some other protocol.

A number of possible configurations for accessing 
hypermedia objects, indexes, and services locally or re- 
motely are depicted in the present figure. These possi- 
bilities are described further below. 

One configuration is a stand-alone workstation 195
that may or may not be connected to a network 105. The
stand-alone system 195 has hypermedia objects 140
and an index 130 stored locally. The stand-alone system
195 also has a topical search engine 120 and
hierarchical view generator 110 installed locally. When the
system is used, a query is input to the workstation 195 and
processed by the local topical search engine 120 and
hierarchical view generator 110 using the index 130.
The results from the topical search engine are output by
the workstation 195.

A second configuration is 185, a workstation with 
hypermedia objects and indexes connected to a net- 
work 105. This configuration is similar to the stand-alone 
workstation 195, except that 185 is always connected 
to the network 105. Also, the local index 130 may be 
derived from local hypermedia objects 140 and/or re- 
mote hypermedia objects accessed via the network 105, 
and created by either a local topical search engine 120 
and hierarchical view generator 110 or a remote topical 
search engine 120 and/or a remote hierarchical view 
generator 110 accessed via the network 105. When
queries are input at the workstation 185, they may be
processed locally at 185 using the local topical search
engine 120, local hierarchical view generator 110, and
local index 130. Alternatively, the local topical search
engine 120 and hierarchical view generator 110 may
access a remote index 130 (e.g. on system 175) via the
network 105. Alternatively, the workstation 185 may ac- 
cess a remote topical search engine 120 and/or hierar- 
chical view generator 110 via the network 105. 

Another possible configuration is 175, a workstation
with an index only. Computer 175 is similar to computer
185 with the exception that there are no local
hypermedia objects 140. The local index 130 is derived from
hypermedia objects 140 accessed via the network 105.
Otherwise, as in computer 185, the index 130, topical
search engine 120, and hierarchical view generator 110
may be accessed locally or remotely via the network 105
when processing queries.

Another possible configuration is computer 180, a
workstation with hypermedia objects only. The
hypermedia objects 140 stored locally at computer 180 may
be accessed by remote topical search engines 120 and
hierarchical view generators 110 via the network 105.
When queries are entered at computer 180, the topical
search engine 120, hierarchical view generator 110, and
index 130 must all be accessed remotely via the network
105.

Another possible configuration is computer 190, a 
client station with no local hypermedia objects 140,
index 130, topical search engine 120, or hierarchical view
generator 110. When queries are entered at computer 
190, the topical search engine 120, hierarchical view 
generator 110, and index 130 must all be accessed re- 
motely via the network 105. 

Another possible configuration is computer 170, a
typical web server. Queries are entered at another
workstation (e.g., 175, 180, 185, or possibly 195) or a client
station (e.g., 190) and sent for processing to the web
server 170 via the network 105. The web server 170
uses a remote topical search engine 120, hierarchical view
generator 110, and index 130 (accessed via the network
105) to process the query. Alternatively, one or more of
these functions (110, 120, and 130) can reside on the
web server 170. The results are returned to the
workstation or client station from which the query was
originally sent.

Figures 2 and 3 give an intuitive description of the
current invention. The current invention operates on an
object collection, which consists of objects and directed
relationships between those objects. (The hypermedia
object database 141 in Figure 1 is an example of an
object collection.) An object is an identifiable entity that
typically contains data, called the object's content. The
nature of this data depends on the kind of object. For
example, a document object would contain text data, while
an image object would contain image data.

A directed relationship describes how objects in the
object collection are related to each other. A directed
relationship R consists of a set of instances of R, where
an instance of relationship R, denoted R(o1,o2),
indicates that relationship R holds from object o1 to object
o2. Each relationship instance may optionally have a
weight associated with it, e.g., R(o1,o2,w), where w is
the weight of the instance of relationship R between
objects o1 and o2. If R(o1,o2), then object o1 is a parent
of object o2, and object o2 is a child of object o1. If object
o2 can be reached from object o1 by following one or
more instances of relationship R, then there is a path in
relationship R from object o1 to object o2. The weight
of a path is a function (e.g., the sum total) of the
weights of the relationship instances that make up the
path. A directed relationship defines a structural
organization of the objects in the collection; therefore a
directed relationship is also called a structural relationship.
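
The definitions above can be made concrete with a small data structure. The following Python sketch is illustrative only: the class name `Relationship` and its method names are assumptions, path weight is taken to be the sum of instance weights (the example function the text mentions), and only simple paths (no repeated objects) are enumerated.

```python
from collections import defaultdict


class Relationship:
    """A directed relationship R stored as weighted instances R(o1, o2, w)."""

    def __init__(self):
        self._children = defaultdict(dict)  # parent -> {child: weight}

    def add(self, o1, o2, w=1.0):
        """Record that R holds from o1 (parent) to o2 (child) with weight w."""
        self._children[o1][o2] = w

    def children(self, obj):
        return sorted(self._children.get(obj, {}))

    def parents(self, obj):
        return sorted(p for p, kids in self._children.items() if obj in kids)

    def path_weights(self, src, dst):
        """Weights of all simple paths from src to dst (sum of instance weights)."""
        results = []

        def walk(node, total, visited):
            for child, w in self._children.get(node, {}).items():
                if child in visited:
                    continue  # do not revisit objects already on this path
                if child == dst:
                    results.append(total + w)
                else:
                    walk(child, total + w, visited | {child})

        walk(src, 0.0, {src})
        return sorted(results)


H = Relationship()
H.add("o1", "o2", 1.0)
H.add("o2", "o3", 2.0)
H.add("o1", "o3", 5.0)
```

With these three instances, o3 has two parents (o1 and o2), and there are two paths from o1 to o3: the direct instance of weight 5.0 and the two-step path through o2 of weight 3.0. No path exists in the opposite direction, which reflects the directed nature of the relationship.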

Note that an undirected relationship is supported by
representing it as a directed relationship that holds in
both directions, e.g., undirected relationship U between
object o1 and object o2 would be represented by two
instances of the directed relationship U', e.g., U'(o1,o2)
and U'(o2,o1).
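
A short sketch of this conversion is shown below; the function name `as_directed` is hypothetical and instances are represented simply as pairs.

```python
def as_directed(undirected_pairs):
    """Represent each undirected instance U(a, b) as U'(a, b) and U'(b, a)."""
    directed = set()
    for a, b in undirected_pairs:
        directed.add((a, b))
        directed.add((b, a))
    return directed


U_prime = as_directed([("o1", "o2")])
```

After the conversion each object is both a parent and a child of the other, so either may appear above the other when a hierarchy is generated.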

An example of a directed relationship is the
hyperlink relationship H. If object o1 contains a hyperlink
pointer to object o2, then H(o1,o2). Other examples of
directed relationships include the subclass or categorization
relationship, the geographic location relationship, the
location within a file system relationship, the conceptual
relationship, and the is-a-part-of relationship. (Note that
this invention applies to all sets of general objects with
directed relationships (or "bi-directional"
representations of undirected relationships), i.e. structural
relationships. In many parts of this disclosure, these general
objects are referred to as hypertext objects with
structural relationships being hyperlink relationships. This is
done for convenience without loss of generality.)

Figure 2, comprising Figures 2A and 2B, is a block
diagram of an example object collection, in particular a
hypermedia object database 50 comprising a set of
hypermedia objects, typically A-J, with hyperlinks,
typically 15, between them. The hyperlinks define a
structural relationship on the hypermedia objects in the
database. Note that this is just one possible example
object collection. Many other kinds of object collections
exist with many different kinds of objects and many
different relationships, all of which are within the domain
of the current invention. Figure 2A shows the initial state
of the hypermedia object database. Figure 2B depicts the
database after a topical search engine 120 has been used
to search the database. The topical search engine analyzes
the contents of the hypermedia objects to identify the
topically relevant objects, e.g., C-G, which are shaded in
Figure 2B. At this point, generally known topical search
engines 120 would present this result to the user by simply
reporting objects C-G in relevance rank order on the
hit-list.

Figure 3 is a block diagram of the same example
hypermedia object database shown in Figure 2A, with
the addition of an example search result (differently
shaded) produced by the current invention (the
hierarchical view generator 110). When searching the
hypermedia object database 50, topically relevant objects,
C-G, are first identified using any general topical search
engine, as shown in Figure 2B.

Then, the hierarchical view generator identifies parent
objects, e.g., A and H, of the topically relevant objects,
C-G, by examining the (hyperlink) structural relationship.
In one preferred embodiment, the hierarchical view
generator ranks each parent based on the number of
hyperlink paths from the parent to the topically relevant
objects and sometimes factors in a function of the weights
of those paths. In some embodiments, the hierarchical view
generator includes only the top ranking parents in the
search result. The (top ranking) parents are structurally
relevant objects, i.e., objects that may not be topically
relevant, but are still relevant to the query due to their
structural relationships with topically relevant objects.
Note that topically relevant objects may also be
structurally relevant.
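
As a rough sketch of the count-based ranking just described, the following Python fragment counts, for each candidate parent, the single-hop hyperlinks into the topically relevant set. The links from A and H follow the description of Figure 3; the link from B to J, the cut-off `TOP_K`, and all variable names are assumptions made for illustration only.

```python
from collections import Counter

# Hyperlink instances H(o1, o2). The A and H links follow the Figure 3
# description; the B -> J link is a made-up extra that reaches no relevant object.
H = [("A", "C"), ("A", "D"), ("A", "E"),
     ("H", "C"), ("H", "G"), ("H", "F"), ("B", "J")]
topically_relevant = {"C", "D", "E", "F", "G"}

# For each parent, count its links into the topically relevant set.
link_counts = Counter(src for src, dst in H if dst in topically_relevant)

TOP_K = 2  # assumed cut-off; the text only says "top ranking parents"
structurally_relevant = [parent for parent, _ in link_counts.most_common(TOP_K)]
```

With this data, A and H each have three links into the relevant set and B has none, so only A and H are kept.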

Thus the invention selects from the database of objects
a set of one or more topically relevant objects (the
topically relevant set, C-G) and a set of one or more
structurally relevant objects (the structurally relevant
set, A and H). Here the structurally relevant set includes
objects that are parents of the objects in the topically
relevant set. Note that the parents (optionally the top
ranking parents) are structurally relevant objects, i.e.,
objects that may not be topically relevant, but are still
relevant to the query due to their structural relationships
with topically relevant objects. Note that topically
relevant objects may also be structurally relevant. Further
note that structurally relevant objects do not have to
have a parent structural relationship to one or more
topically relevant objects, but can have any ancestral
relationship defined by the path to the topically relevant
objects. Also, one or more structural relationships can be
used. (See below for a more detailed description.)

The identification of structurally relevant objects by
the hierarchical view generator provides a number of
benefits. First, relevant objects that were missed by the
topical search engine are found by the hierarchical view
generator and returned to the user. The topical search
engine finds only those objects whose content matches
the particular query terms selected by the user. In the
current hypermedia object database example, additional
(structurally) relevant objects can be found by considering
the hyperlink pointers incident on the topically relevant
objects. Hyperlink pointers are typically used to cite
related objects, direct readers to related objects, or
provide a path for reading the different components of
a hypermedia object. Therefore, the more hyperlink
pointers an object has to the set of topically relevant
objects, the more likely it is that the object is relevant
to the query. Moreover, if an object points to many
topically relevant objects, that object is likely to contain
a more general discussion of the query topic.

Second, the structurally relevant objects found by
the hierarchical view generator are good navigational
starting points. Good navigational starting points are
those objects from which many topically relevant objects
can be reached by following one or more directed
relationship paths from the starting point. In Figure 3,
structurally relevant objects A and H have been shaded, and
their hyperlinks to the original set of topically relevant
objects (C-G) have been emphasized with a different
shading, typically 25. It can be seen from the hyperlink
structure that by starting with either object A or H, all of
the topically relevant objects (C, D, E, F, and G) in the
database can be reached by following a single hyperlink.
Because objects A and H together provide hyperlink
pointers to all of the topically relevant objects, they
are good navigational starting points for browsing the
hypermedia object database given the current interest
of the user.

Third, using the structurally relevant objects, the
hierarchical view generator organizes the relevant objects
into meaningful groups or clusters. Groups are formed
based on the following observation: a parent object often
contains information on a particular theme, so its
children are likely to share that common theme. In
Figure 3, parent objects A and H each provide a grouping
or clustering of the topically relevant objects. Object A
contains hyperlink pointers to objects C, D, and E, while
object H contains pointers to objects C, G, and F, such
that the topically relevant objects are divided into two
(overlapping) clusters. It can be inferred that objects C,
D, and E are related by the topic of object A, and objects
C, G, and F are related by the topic of object H.
Therefore, it is productive to organize the topically
relevant objects into these two clusters and present this
organization to the user.
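
The grouping described for Figure 3 can be sketched in a few lines of Python. The link data mirrors the text (A points to C, D, E; H points to C, G, F); the dictionary layout and variable names are illustrative assumptions, not the data structures of the disclosure.

```python
# Parent -> children hyperlinks as described for Figure 3.
links = {"A": ["C", "D", "E"], "H": ["C", "G", "F"]}
topically_relevant = {"C", "D", "E", "F", "G"}

# One cluster per parent, keeping only topically relevant children.
clusters = {
    parent: sorted(child for child in kids if child in topically_relevant)
    for parent, kids in links.items()
}

# Objects that appear under more than one parent make the clusters overlap.
overlap = set(clusters["A"]) & set(clusters["H"])

# Every topically relevant object is reachable from A or H in one hop.
covered = set(clusters["A"]) | set(clusters["H"])
```

Object C lands in both clusters, which is the overlap the text refers to, and the union of the two clusters covers the whole topically relevant set.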

Moreover, the groups created using the structurally
relevant objects form a hierarchy of structurally and
topically relevant objects. In Figure 3, the dashed
hyperlinks 25 organize the topically and structurally
relevant objects into a hierarchy, which is much more
meaningful to the user than an arbitrary graph, which is
all that is known before the hierarchical view generator
110 executes. This hierarchical presentation shows only
those objects that are structurally and topically relevant
to the user query. The objects in this hierarchy can be
ranked further if necessary.

Note that by defining different structural relationships
in the object collection (see Figure 8), different
paths exist and therefore, different sets of (parent)
objects, i.e., different structurally relevant sets, are
identified for any given set of topically relevant objects.
Figures 4 and 5 explore this further.

Figure 4 is a block diagram of the same example
hypermedia object database 50 shown in Figure 3, with
the exception that an alternate result that might be
produced by the current invention using a different
relationship, e.g., geographic location, is highlighted.
For clarity, the hyperlink pointers 15 between the objects
A-J are not shown. Here, the objects participate in
geographical (structural) relationships, typically X and Y.
All objects that participate in the X relationship (i.e.,
are assigned to the geographical location described by X)
are connected to X by structural links 41. All objects that
participate in the Y relationship (i.e., are assigned to
the geographical location described by Y) are connected to
Y by structural links 42.

After the topically relevant objects C-G are identified
by a topical search engine, the hierarchical view
generator 110 uses the geographical relationship links
41 and 42 to identify the structural geographical
relationship parents X and Y. X and Y can then be used to
organize the search results into a hierarchy where
topically relevant objects are grouped by geographic
location.

Figure 5 is a block diagram of the same example 
hypermedia object database 50 shown in Figure 3, with 
the exception that an alternate result that might be pro- 
duced by the current invention using a different relation- 
ship, e.g., category, is shaded. For clarity, the hyperlink 
pointers 15 between the objects A-J are not shown. 
Here, the objects participate in categorization relation- 
ships, typically M-R. All objects that participate in a giv- 
en categorization relationship (i.e., are assigned to that 
category, possibly with some confidence weight) are 
connected to that category by structural links 51.

After the topically relevant objects C-G are identi- 
fied by a topical search engine, the hierarchical view 
generator uses the categorization relationship links 51 
to identify category relationship parents N, P, and R.
Even though some topically relevant objects participate 
in multiple categorization relationships, the structural 
links 51 with confidence weights are used as evidence 
to identify N, P, and R as the parents. N, P, and R can 
then be used to organize the search results into a hier- 
archy where topically relevant objects are grouped by 
category. In particular, the structural links used for this
grouping are highlighted as dashed links 52 in Figure 5.

In subsequent figures, a number of data structures
are described as tables. This is for convenience of
drawing and description. In an actual implementation, any
usual data structure such as a normal array, an
associative array, a linked list, a hash table, or any other
structure may equivalently be used without affecting the
invention described herein. Note, however, that the tables
described below can be readily implemented in any general
relational database management system (RDBMS).

Figure 6, comprising Figures 6A and 6B, is one
preferred set of prior art data structures in the index 130
used to implement the present invention. The Object
Catalog 210 in Figure 6A stores information about each
of the objects in the collection. The Object Catalog is a
table that contains an entry 215 for each object in the
collection. Each entry 215 contains the following
information, represented by columns in the table 210:
Sequence Number 220 (the number or offset of the current
entry in the table), Object Id 225 (a unique identifier for
the object that corresponds to the current entry), Object
Pointer 230 (a reference to the object corresponding to
the current entry that allows the object to be retrieved),
Object Title 235 (the title of the object corresponding to
the current entry) and a set 240 of Attributes 245
associated with the object corresponding to the current
entry. In a preferred embodiment, each Attribute 245 is a
pointer to an Attribute Table 250, shown in Figure 6B.
Using a table 250 permits an attribute to be a
multi-dimensional value. Each dimension is given a Name 255
and a Value 260. Therefore, table 250 has a pair of Name
255 and Value 260 fields for each dimension of the
attribute. These data structures are part of the index 130
and their contents would be filled in when the object
collection is indexed.

Figure 7, comprising Figures 7A and 7B, is one
preferred set of data structures used to implement the
present invention. In Figure 7A, the data structure 310
represents a Query arranged in an array or list as a
sequence of n Query Factors 315. Each Query Factor 315
consists of a Query Element 320, an optional Weight
325 and an optional Connector 330. The Query also
contains a list 340 of zero or more relationships 345 (see
Figure 8) to be used by the Hierarchical View Generator
110 (see Figure 10).

In Figure 7B, a novel Object Hit-list 350 is used to
identify in rows 355 the objects out of Object Catalog
210 that have been selected as the result of a search
by Topical Search Engine 120 or processing by
Hierarchical View Generator 110, i.e., the topically
relevant set and the structurally relevant set(s). Each row
contains fields for the object's Number in the list 360, its
computed Score/Rank 365 (assigned by the topical search
engine 120 or the Hierarchical View Generator 110 (see
Figure 11)), its Object Id 370, a Selector 375 (used to
mark entries for future processing), and a pointer to a
Children Table 380 (see Figure 9). The Children field
380 is only used when the Object Hit-list 350 is created
by the Hierarchical View Generator 110, or only used for
objects in the structurally relevant set(s).

Figure 8 is one preferred set of novel data structures
in index 130 used to describe and catalog structural
relationships in the object collection and implement the
present invention. By definition, a relationship R is a set
of ordered tuples (triples) of the form (o1, o2, w), where
o1 and o2 are objects (and w is an optional weight). The
triple (o1, o2, w) indicates that the relationship R exists
from object o1 to object o2 with weight w. For example,
if relationship R is the hyperlink relationship, (o1, o2, w)
indicates that object o1 contains a hyperlink pointer to
object o2 with weight w. In Figure 2A, this is shown as
a structural link 15 from object o1 to object o2.
Alternatively, if R is the categorization relationship,
(o1, o2, w) indicates that object o2 belongs to the
category o1 with weight w. In Figure 5, this is shown as a
structural link 51 from category o1 to object o2.

The Relationship Catalogue 410 is a table containing
rows 415 for the purpose of cataloguing the relationships
in the object collections and determining which
particular relationships are to be used to process the
current query. The fields of the rows consist of Selector
420, Relationship Name 425, and Relationship Table
Pointer 435. A Relationship Table 450 contains rows
455 showing how objects are related to each other by
the current relationship. The fields of the rows 455
consist of From-Id 460, To-Id 465, and optional Weight 470,
i.e., the triple above. Relationship Table 450 also has a
Selector field 473 used for marking selected entries 455
for later processing. The Relationship Catalogue is
generated at database indexing time and is part of the index
130.
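
One way to picture these two tables is as plain record types. The sketch below uses Python dataclasses; the class names, field names, the default weight of 1.0, and the sample rows are all assumptions made for illustration, and the text itself notes that any usual data structure or an RDBMS could be used instead.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RelationshipRow:
    """One row 455 of a Relationship Table 450."""
    from_id: str          # From-Id 460 (the parent object)
    to_id: str            # To-Id 465 (the child object)
    weight: float = 1.0   # optional Weight 470; the default of 1.0 is an assumption
    selector: int = 0     # Selector 473


@dataclass
class CatalogueRow:
    """One row 415 of the Relationship Catalogue 410."""
    name: str                                                   # Relationship Name 425
    table: List[RelationshipRow] = field(default_factory=list)  # reached via Pointer 435
    selector: int = 0                                           # Selector 420


# Hypothetical catalogue with two relationships.
catalogue = [
    CatalogueRow("hyperlink", [RelationshipRow("A", "C"), RelationshipRow("H", "C")]),
    CatalogueRow("category", [RelationshipRow("N", "C", 0.8)]),
]
```

Holding the Relationship Table directly in the catalogue row stands in for the Relationship Table Pointer 435; a real index would more likely store a reference or table name.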

Figure 9 is one preferred data structure used to im- 
plement the present invention. Figure 9 shows Children 
Table 580 consisting of entries 590, where each entry 
has an Object Hit-list identifier field 595, a Child Number 
field 585, and a Relation field 575. The Object Hit-list 
identifier 595 identifies the Object Hit-list 350 that con- 
tains an entry for the current child object. The Child 
Number 585 is an index into the Number field 360 (of 
the Object Hit-list 350 identified in 595) identifying which 
entry 355 corresponds to the current child object. The 
Relation field 575 corresponds to Relationship Name 425 in
the Relationship Catalogue 410, and identifies the rela- 
tionship by which the child belongs to the corresponding 
parent. 

Figure 10 is a flowchart showing the method steps 
of one preferred process executed by the present inven- 
tion. By executing the process 600, the system 110 pro- 
duces ranked hierarchical views for a set of objects in a 
query context. In the descriptions of processes 600 and 
700 that follow, the convention is used that if a numerical 
label is used to describe a quantity of like objects, such 
as rows in a table, and if it is necessary to indicate one 
specific such instance, then the numerical label is used 
with an alphabetic suffix, resulting in descriptors such
as 123a, for example.

The process begins with a Query 310 entered by
the user in step 605 and an Object Hit-list 350 of objects
identified by the user in step 610. In a preferred
embodiment, the Object Hit-list 350, which identifies a
subset of the objects in Object Catalog 210, is the hit-list
of topically relevant objects (the topically relevant set)
generated by a topical search engine with Query 310 on
Object Collection 141 (i.e., a content-based search). In
step 615, the objects listed in Object Hit-list 350 are
ranked with respect to Query 310, with resulting object
ranking values placed in Rank field 365. In a preferred
embodiment, the object Ranks 365 are supplied with the
Object Hit-list 350 in step 610, so ranking of the
topically relevant objects by the present system is not
necessary and step 615 is omitted.

Next, the system enters loop 635 where each iteration
of the loop builds the next higher level of the result
hierarchy. The loop is controlled by step 635, which tests
whether or not to build another level of the hierarchy. If
another level of the hierarchy should be built, branch
636 is taken. If the hierarchy building process is
complete, branch 637 is taken. The test in step 635 may be
performed in a variety of ways. In a highly interactive
system, the user may be prompted as to whether or not
to build another level of the hierarchy. This prompting
may be accompanied by executing process 900 (described
below) to display to the user the current result
built so far. Alternatively, a predefined system constant
may determine how many levels of the hierarchy to
build.

When branch 636 is taken, the system executes a
series of steps to add another level to the result
hierarchy. Each level of the result hierarchy is stored in
an Object Hit-list 350. The Object Hit-list 350 for the
current level is created and initialized in step 620.
Initialization involves setting all entries in the Object
Hit-list 350 to null. In step 625, the relationships
identified by the user in 340 (see Figure 7A) are used to
select relationships in the Relationship Catalogue 410 for
generating the next level of the hierarchy in subsequent
steps. For each relationship 345 specified by the user, the
Relationship Name column 425 is searched to find a match
and the Selector field 420 of the matching entry 415 is
set to 1. The Selector field 420 of all other entries is set
to 0. Note that in some preferred embodiments, the
relationships can be default relationships, and query
component 340 is not provided by the user. For example,
the default can be hypertext links.
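
A minimal Python sketch of the selection in step 625 follows. The dictionary-based rows, the function name `select_relationships`, and the relationship names are hypothetical; the fallback to a hyperlink default mirrors the example given in the text.

```python
def select_relationships(catalogue, requested, default=("hyperlink",)):
    """Step 625 sketch: set Selector 420 to 1 for requested names, 0 otherwise.

    Falls back to `default` when the query supplies no relationships.
    """
    names = set(requested) if requested else set(default)
    for row in catalogue:
        row["selector"] = 1 if row["name"] in names else 0
    return [row["name"] for row in catalogue if row["selector"] == 1]


# Hypothetical Relationship Catalogue rows (name 425, selector 420).
catalogue = [
    {"name": "hyperlink", "selector": 0},
    {"name": "category", "selector": 0},
    {"name": "geographic location", "selector": 0},
]
```

Calling `select_relationships(catalogue, ["category"])` marks only the category row, while an empty request list selects the assumed hyperlink default.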

Step 630 begins an iteration over all of the
relationships selected in step 625. Each relationship 415
marked as selected in field 420 is processed as follows
in turn, by following branch 631. The relationship
currently being processed, called the current relationship,
will be referred to as 415a. When all selected
relationships have been processed, branch 632 is taken.

Each iteration of steps 630 and 700 produces one
structurally relevant set defined by a given structural
relationship, e.g., hypertext link, geographic location,
category, etc. For each object in this structurally relevant
set, if the object already appears in the current Object
Hit-list 350, its entry 355 is updated, otherwise an entry
for the object is added to the current Object Hit-list (see
Figure 11). All of the objects in the current Object Hit-list
350 define a structurally relevant set based on a
combination of one or more structural relationships. This
structurally relevant set is defined for the given level of
the generated hierarchy (hierarchy level).

In step 700 (described in detail in Figure 11), the 
Object Hit-list 350 for the current result hierarchy level 
being built is updated for the current relationship. 

After all selected relationships have been proc- 
essed, branch 632 is taken to step 635. At this point, a 
new level of the hierarchy (e.g. parent, grandparent, 
great-grandparent along a path) has been created and 
the objects in that level are identified by the entries 355 
in the Object Hit-list 350 that was created and populated 
by the steps executed in the current iteration of 635. 
Each entry 355 has a Children field 380 that identifies 
the children of the current entry. The Children field 380 
points to a Children Table 580, which contains an entry 
590 for each child of the current object. Each Children 
Table entry 590 has an Object Hit-list field 595 that iden- 
tifies the Object Hit-list 350a (the next lower level of the 
result hierarchy) that contains the child, a Child Number 
585 that identifies the entry 355 in the Object Hit-list 
350a that corresponds to the child object, and the Re- 
lation 575 that identifies the relationship by which this 
child belongs to the current object. 

In step 800, the entries 355 in Object Hit-list 350 are 
sorted according to score 365, and the ranking order is 
entered into Number field 360. The process then iterates 
at step 635. Note that in one embodiment, there is only 
one hierarchy level, i.e., only parents of the topically
relevant set. In this case, there is no iteration step 635
and the entries 355 can optionally be displayed (step 900)
after step 800.
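
Step 800 can be sketched as a sort followed by renumbering. The fragment below assumes that a higher Score 365 means a better rank, so that Number 1 is the top entry; the text does not state the sort direction explicitly, and the entry layout and function name are illustrative.

```python
def rank_hit_list(hit_list):
    """Step 800 sketch: sort by Score 365 (descending) and fill Number 360."""
    hit_list.sort(key=lambda entry: entry["score"], reverse=True)
    for number, entry in enumerate(hit_list, start=1):
        entry["number"] = number
    return hit_list


# Hypothetical parent hit-list before ranking.
parents = [
    {"object_id": "A", "score": 8.0, "number": None},
    {"object_id": "H", "score": 12.0, "number": None},
]
rank_hit_list(parents)
```

After the call, H holds Number 1 and A holds Number 2.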

In step 900 (described in detail in Figure 12), the 
results are displayed. Note that one or more levels of
the hierarchy can be suppressed in the display, e.g.,
grandparents are displayed but not parents.

Figure 11 shows the steps for process 700 in detail. 
Step 700 identifies and ranks parent objects for the giv- 
en child objects based on the current structural relation- 
ship. The parent objects are entered into the Object Hit- 
list 350 for the current result hierarchy level being built 
(i.e., the Object Hit-list 350 created in step 620). The 
given children are identified in a different Object Hit-list 
350, which was either input in step 610 (if this is the first 
iteration of 635), or was created by the previous iteration 
of 635. To avoid confusion, the Object Hit-list for the cur- 
rent result hierarchy level being built (i.e., the parents) 
will be referred to as 350'. 

Note that if this is the first iteration of 635, then the 
current child objects are the topically relevant objects 
identified by the topical search engine. Otherwise, the
current child objects are the structurally relevant objects
identified during the previous iteration of 635, and the 
parents currently being identified and ranked are at least 
grandparents of the topically relevant objects. 

In step 705, the N top ranking child objects from the
children Object Hit-list 350 are selected for further
processing. This is done by setting the Selector field 375
to 1 for those objects whose Number field 360 is less
than or equal to N. N may be hard-wired into the system,
i.e., a pre-defined value, or selected by the user. In one
preferred embodiment, a value of 10 to 20 would be
suitable. The value of N is a trade-off between quality and
performance. As N becomes larger, result quality will
improve since more child objects are used to generate the
hierarchy. However, as N becomes larger, more
processing is required and performance deteriorates.
Moreover, there are diminishing returns as N becomes
larger, since lower ranked child objects are less relevant
to the query.
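
The selection in step 705 amounts to a threshold on the Number field. A small Python sketch, with a hypothetical entry layout and function name:

```python
def select_top_n(hit_list, n):
    """Step 705 sketch: set Selector 375 to 1 where Number 360 <= n, else 0."""
    for entry in hit_list:
        entry["selector"] = 1 if entry["number"] <= n else 0
    return [entry["object_id"] for entry in hit_list if entry["selector"] == 1]


# Hypothetical child hit-list, already numbered in rank order.
child_hit_list = [
    {"object_id": "C", "number": 1, "selector": 0},
    {"object_id": "D", "number": 2, "selector": 0},
    {"object_id": "E", "number": 3, "selector": 0},
]
selected = select_top_n(child_hit_list, 2)
```

Here N is 2 only to keep the example short; the text suggests 10 to 20 for one preferred embodiment.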

In step 710, an iteration over each of the top N child
Object Hit-list entries 355 whose Selector field 375 is
set to 1 is performed to find their parents (as defined by
the current structural relationship). The entry currently
being processed, called the current object, will be
referred to as 355a. If there are more objects to process,
branch 712 is followed, otherwise branch 711 is
followed.

In step 725, all of the instances of the current
structural relationship in which the current child object
participates (as a child) are selected. The Object Id 370
for the current Object Hit-list entry 355a is used to select
entries 455 from the Relationship Table 450 for the
current relationship 415a (recall that 415a is the current
relationship being processed in step 630, described
above). An entry 455 is selected if its To-Id 465 matches
370. Selected entries 455 are marked by setting their
Selector field 473 to 1. The Selector field 473 for all other
(non-selected) entries is set to 0.

In step 730, an iteration over each of the entries 455
whose Selector field 473 is set to 1 is performed to
process each of the parent objects in the relationship
instances identified in step 725. The entry currently being
processed will be referred to as 455a. If there are more
entries to process, branch 732 is taken, otherwise
branch 731 is taken.

In step 735, the parent object in the current
relationship instance being processed is ranked and its
score in the parent object hit-list is updated. The parent
Object Hit-list 350' being created by the current iteration
of 635 is updated for the current entry 455a. The From-Id
460a for the current entry 455a is used to identify the
appropriate entry 355'a to update by looking up 460a in the
Object Id column 370'. If an entry 355'a does not exist
with Object Id 370' equal to 460a, one is created. The
Score 365'a for entry 355'a is updated by applying formula
F, which takes as input the current Score 365'a, the
Weight 470a of the current relationship instance 455a, and
the child object's Score 365a (from the child Object
Hit-list 350). The Children Table 580 for entry 355'a is
obtained by following the pointer in field 380'. If entry
355'a is newly created, a new Children Table 580 is
created. The Number 360a for entry 355a is added to
Children Table 580 by adding a new entry 590 with its
Child Number 585 set to Number 360a, its Object Hit-list
pointer 595 set to point to Object Hit-list 350, and its
Relation 575 set to 425 (for the relationship currently
being processed).

In a preferred embodiment, a parent object's
structural relevance score is computed by summing the
weighted rank scores of its children objects, where
children objects are those objects in the set identified in
step 705 to which the parent object is related, as
determined by Relationship Table 450. A child's weighted
rank score is its score in field 365a multiplied by the
relationship weight 470a. To produce this end result,
formula F multiplies Score 365a by Weight 470a and adds that
value to 365'a. Note that when a new entry 355'a is created,
the Score field 365'a is initialized to 0.
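
Steps 710 through 735 with the preferred formula F (new parent score = old parent score + child score x relationship weight) can be sketched as follows. The function name, the dictionary-based entries, the simplified Children Table (a list of child number and relation pairs), and the sample scores and weights are all assumptions for illustration.

```python
def update_parents(children, relationship_rows, relation_name, parents=None):
    """Process 700 sketch for one relationship, using the preferred formula F."""
    parents = {} if parents is None else parents
    for child in children:                                  # step 710
        if not child["selector"]:
            continue
        for from_id, to_id, weight in relationship_rows:    # steps 725 and 730
            if to_id != child["object_id"]:
                continue
            # Step 735: find or create the parent entry (score starts at 0).
            entry = parents.setdefault(
                from_id, {"object_id": from_id, "score": 0.0, "children": []})
            entry["score"] += child["score"] * weight       # formula F
            entry["children"].append((child["number"], relation_name))
    return parents


# Hypothetical child hit-list (G falls outside the top N, so selector is 0).
children = [
    {"object_id": "C", "number": 1, "score": 5.0, "selector": 1},
    {"object_id": "D", "number": 2, "score": 4.0, "selector": 1},
    {"object_id": "E", "number": 3, "score": 3.0, "selector": 1},
    {"object_id": "F", "number": 4, "score": 2.0, "selector": 1},
    {"object_id": "G", "number": 5, "score": 1.0, "selector": 0},
]
# Relationship Table rows (From-Id, To-Id, Weight); weights are made up.
hyperlinks = [("A", "C", 1.0), ("A", "D", 1.0), ("A", "E", 1.0),
              ("H", "C", 0.5), ("H", "G", 1.0), ("H", "F", 1.0)]
parents = update_parents(children, hyperlinks, "hyperlink")
```

Passing the returned dictionary back in as `parents` for a second relationship would accumulate scores across relationships, which corresponds to updating an existing entry rather than creating a new one.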

After all selected child objects have been processed
in 711, the parent Object Hit-list 350' will have been
updated for the current relationship. The algorithm then
returns to step 630 in Figure 10.

Figure 12 consists of Figures 12A and 12B. Figure
12A shows the steps for process 900 in detail. A
hierarchical result display is created by processing the
most recently created parent Object Hit-list 350'. In step
905, an iteration over each of the entries 355' in parent
Object Hit-list 350' is performed, where each entry is
processed as follows in turn by following path 907. When
all entries have been processed, path 906 is followed.

In step 910, process 950 (described in detail in Fig- 
ure 12B) is called for the current entry. 

Figure 12B shows the steps for process 950 in detail.
Process 950 operates on an entry 355 from an Object
Hit-list 350. In step 955, the current entry 355 is
displayed by indenting and printing various attributes
(obtained from the Object Catalog 210) for the object
identified by the current entry. If the Children field 380
for the current entry 355 is not empty, then an iteration
over the children identified by the Children Table 580
pointed to by field 380a is performed in step 960. At each
level of the hierarchy, the relationships used for
displaying children may be specified. If relationships have
been specified for this level, they are used to filter the
children that will be displayed by requiring that a child's
Relation 575 match one of the specified relationships. Each
matching child is processed as follows in turn by following
path 962. The current entry being processed will be
referred to as 590a. If there are no more children to
process, path 961 is followed.

In step 965, the Child Number 585a for the current
Children Table entry 590a is used to identify the child's
corresponding hit-list entry 355 by indexing into the
Number field 360 of the Object Hit-list 350 identified in
Object Hit-list identifier field 595a of entry 590a. This
hit-list entry 355 is then processed by calling process 950
recursively for that entry.

The result of step 900 is a display of ranked hierar- 
chies where children are shown grouped and indented 
under their parent. An example of such a display is 
shown in Figure 13. 
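
The recursive display of process 950 can be sketched as below. This is a simplified illustration: it applies one optional relationship filter to every level (the text allows a different filter per level), prints only a title, and uses hypothetical names (`render`, `hit_lists`, the `"level0"` and `"level1"` keys) that do not appear in the disclosure.

```python
def render(entry, hit_lists, depth=0, allowed=None, out=None):
    """Process 950 sketch: collect indented lines for an entry and its children."""
    out = [] if out is None else out
    out.append("  " * depth + entry["title"])                    # step 955
    for list_id, child_number, relation in entry.get("children", []):  # step 960
        if allowed is not None and relation not in allowed:
            continue  # filter children by their Relation 575
        # Step 965: locate the child by Number 360 in the identified hit-list.
        child = next(e for e in hit_lists[list_id] if e["number"] == child_number)
        render(child, hit_lists, depth + 1, allowed, out)
    return out


# Hypothetical two-level result: parent A over topically relevant C and D.
hit_lists = {
    "level0": [{"number": 1, "title": "C"}, {"number": 2, "title": "D"}],
    "level1": [{"number": 1, "title": "A",
                "children": [("level0", 1, "hyperlink"),
                             ("level0", 2, "category")]}],
}
lines = render(hit_lists["level1"][0], hit_lists)
```

Printing `"\n".join(lines)` shows A on the first line with C and D indented beneath it; passing `allowed={"hyperlink"}` would drop D, whose Relation is the category relationship.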

Figure 13 shows a sample output result of the system.
The Figure shows the result of iterating step 635
once, such that a two level hierarchy is generated. The
original topically relevant objects supplied in step 610
are displayed indented as 1320. The structurally relevant
parent objects found after one iteration of step 635
are displayed non-indented as 1310. The parent objects
1310 form the next level of the hierarchical view, provide
navigational starting points for browsing the relevant
objects, and group the topically relevant child objects
1320. The display provides the end user with insight into
the structure of the object collection being searched.
Attributes for each of the objects shown in the display are
obtained from the Object Catalog 210 and Attribute
Tables 250.



Claims 

1. A computer system having one or more memories 
and one or more central processing units, the com- 
puter capable of accessing one or more database 
memories, a plurality of objects stored in one or
more of the database memories and each object 
identified by an index, the computer system further 
comprising: 

a user interface for entering a query; 
a hit list, stored in one or more of the memories, 
the hit list being a topically relevant set of a plu- 
rality of topically relevant objects, the topically 
relevant set being selected from one or more of 
the database memories and ranked by a topical 
search engine using the index, the rank indicat- 
ing a topical relevance to the query; 
a relationship data structure containing infor- 
mation about how two or more of the objects 
are related to one another by one or more struc- 
tural relationships, the structural relationships 
being directed; and 

a hierarchical view generator that selects a 
structurally relevant set of the objects, each ob- 
ject in the structurally relevant set being a par- 
ent of one or more of the topically relevant ob- 
jects according to one or more of the structural 
relationships, each of the topically relevant ob- 
jects being a child of one or more of the parents, 
the hierarchical view generator organizing each 
of the child objects under one or more of its re- 
spective parents to create a directional hierar- 
chy. 
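
As an illustration only, the grouping performed by the hierarchical view generator of claim 1 might be sketched as follows. The representation of the hit list as `(object_id, rank)` pairs, the relationship data structure as directed `(parent, child)` pairs, and the name `build_level` are all assumptions and are not taken from the specification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_level(hit_list: List[Tuple[str, float]],
                relationships: Iterable[Tuple[str, str]]
                ) -> Dict[str, List[Tuple[str, float]]]:
    """Group topically relevant objects under their structural parents.

    hit_list:      (object_id, rank) pairs from a topical search engine.
    relationships: directed (parent_id, child_id) pairs.
    """
    ranks = dict(hit_list)
    grouped: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    for parent, child in relationships:
        # Only edges that point at a topically relevant object contribute
        # a parent to the structurally relevant set.
        if child in ranks:
            grouped[parent].append((child, ranks[child]))
    # Within each parent, list children by descending topical rank.
    for children in grouped.values():
        children.sort(key=lambda c: c[1], reverse=True)
    return dict(grouped)
```

A child with several parents appears under each of them, which matches the claim language "under one or more of its respective parents".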

2. A computer system, as in claim 1, further comprising
a display that renders a presentation of one or
more of the parents and the child objects associated
with the respective parent.

3. A computer system, as in claim 1 or 2, where the 
database memory resides on one or more connect- 
ed computers that are connected to the computer 
system by a network.

4. A computer system, as in claim 3, where the query 
accesses the computer system from the network.

5. A computer system, as in any preceding claim, 
where the hierarchical view generator subselects a 
subset of the most relevant topically relevant objects
according to rank as the topically relevant set.

6. A computer system, as in any preceding claim, 
where the information in the relationship data struc- 
ture includes weights for the structural relationships 
and the hierarchical view generator subselects a 
subset of the structurally relevant objects as the 
structurally relevant set as a function of the weights. 
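
Claim 6 leaves the weighting function open. One possible reading, given purely as a sketch, sums the weights of each parent's edges to topically relevant children and keeps the highest-scoring parents. The summation, the `top_k` cutoff, and the name `select_parents_by_weight` are assumptions, not the claimed function.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def select_parents_by_weight(children: Set[str],
                             weighted_edges: Iterable[Tuple[str, str, float]],
                             top_k: int) -> List[str]:
    """Keep the top_k parents by total edge weight to relevant children.

    weighted_edges: directed (parent_id, child_id, weight) triples.
    """
    totals: Dict[str, float] = defaultdict(float)
    for parent, child, weight in weighted_edges:
        if child in children:
            totals[parent] += weight
    # Sort by descending total weight; break ties by identifier so the
    # result is deterministic.
    ranked = sorted(totals, key=lambda p: (-totals[p], p))
    return ranked[:top_k]
```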

7. A computer system, as in any preceding claim, 
where one or more relationship data structures de- 
fining one or more structural relationships are se- 
lected using information in the query. 

8. A computer system, as in any preceding claim, 
where the structural relationships include any one 
or more of the following: the hyperlink relationship, 
the subclass or categorization relationship, the ge- 
ographic location relationship, the location within a 
file system relationship, the conceptual relation- 
ship, and the is-a-part-of relationship. 

9. A computer system, as in claim 1, further comprising:

an iterator in the hierarchical view generator
that treats the parents as topically relevant ob- 
jects and runs the hierarchical view generator 
zero or more times, each time creating a parent 
level in the directional hierarchy. 

10. A system, as in claim 9, where a different set of one 
or more structural relationships is selected in one 
or more iterations of the iterator. 

11. A computer system, as in claim 9, further compris- 
ing: 

a display that renders a presentation of one or 
more selected parents in one of the parent lev- 
els and one or more of the objects, being dis- 
played objects, structurally related to one of the 
selected parents and at a lower level than the 
parent level in the directional hierarchy. 



12. A system, as in claim 11, where one or more of the
displayed objects is suppressed in the display. 

13. A system, as in claim 12, where the displayed object
is suppressed due to its structural relationship with
the respective parent.

14. A computer system having one or more memories
and one or more central processing units, the computer
capable of accessing one or more database
memories, a plurality of objects stored in one or
more of the database memories and each object
identified by an index, the computer system further
comprising:

a) means for selecting a topically relevant set
of two or more of the objects;

b) means for ranking the objects in the topically
relevant set;

c) means for identifying one or more structural
relationships between one or more of the objects
in the topically relevant set, being children,
and one or more of the objects in the database
memories, being parents, the structural
relationships being directional from the parent
to the child; and

d) means for organizing one or more of the children
under each of the respective parents in a
structural hierarchy.


15. A method for hierarchically grouping a plurality of
objects stored in one or more database memories
of a computer system, comprising the steps of:

a) selecting a topically relevant set of two or
more of the objects;

b) ranking the objects in the topically relevant
set;

c) identifying one or more structural relationships
between one or more of the objects in the
topically relevant set, being children, and one
or more of the objects in the database memories,
being parents, the structural relationships
being directional from the parent to the child;
and

d) organizing one or more of the children under
each of the respective parents.

16. A method, as in claim 15, further comprising the
steps of:

e) deselecting the children and identifying the
parents as children; and

f) repeating steps c and d zero or more times,
each time creating a next hierarchical level.
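
Steps c through f of claims 15 and 16 can be sketched as a simple loop. This is an illustrative reading under stated assumptions: relationships are directed `(parent, child)` pairs, `passes` counts the total number of times steps c and d run, and the name `build_levels` is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_levels(topical_set: Set[str],
                 edges: Iterable[Tuple[str, str]],
                 passes: int) -> List[Dict[str, List[str]]]:
    """Build successive hierarchy levels, lowest level first.

    Each level maps a parent to the children grouped under it.
    """
    edge_list = list(edges)
    levels: List[Dict[str, List[str]]] = []
    children = set(topical_set)
    for _ in range(passes):
        level: Dict[str, List[str]] = {}
        # Steps c and d: find parents of the current children and
        # organize the children under them.
        for parent, child in edge_list:
            if child in children:
                level.setdefault(parent, []).append(child)
        if not level:
            break  # no further parents exist, so no new level is created
        levels.append(level)
        # Step e: deselect the children and treat the parents as children.
        children = set(level)
    return levels
```

With `passes=1` the result corresponds to a two-level view (parents and their topically relevant children); each additional pass adds one parent level. Bounding the loop by `passes` also keeps it finite if the relationship data contains cycles.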

[Figure 13: Netscape browser window titled "IBM Planetwide Search
Results". Parent objects are labeled 1310; their ranked child
objects are shown indented beneath them.]

YOUR QUERY WAS: INSTALLING WINDOWS 95

• THE BSC WSOC WINDOWS 95 GUIDE (HYPERLINK)   1310
    [81] 3-INSTALLING WINDOWS 95 (HYPERLINK)
    [71] 4-INFORMATION ON SPECIFIC SYSTEMS AND HARDWARE (HYPERLINK)
    [69] 1-INTRODUCTION (READ THIS FIRST!) (HYPERLINK)

• 1996 UPDATES (HYPERLINK)   1310
    [69] 1-INTRODUCTION (READ THIS FIRST!) (HYPERLINK)
    [67] 2-INSTALLING AND CONFIGURING WATSON SYSTEMS (HYPERLINK)
    [59] 10-WINDOWS 95 RESOURCES (HYPERLINK)

• NETWORK ADAPTERS (HYPERLINK)   1310
    [71] 4-INFORMATION ON SPECIFIC SYSTEMS AND HARDWARE (HYPERLINK)
    [71] THE BSC WSOC WINDOWS 95 GUIDE (HYPERLINK)

• THINKPAD 701 "BUTTERFLY" SYSTEMS (HYPERLINK)   1310